Oops. Does it squeeze in, aye, like that? Yeah. Okay. Right. Yep. Okay. Sure. Just said the same things you just said. So how are we getting along? I wanted to talk about that, actually. The data processing is fine, but we don't particularly want to do the GUI for it. No, not really. If you wanna do that, then tell me how you want the data presented. Do you want me to tell you? Okay. At the moment I've created two classes, one that represents speakers and one that represents the meetings, and the information about both is contained within each object. It writes objects, and the objects contain all the information about the meetings and the speakers: the speakers that are at the meetings and the amount they speak, and then the averages are contained with the speakers. So there are two separate classes, aren't there. They're two different objects. It writes the objects, and then you call the objects back and those returned objects have all the information that they need, and then you can call methods to return whatever you want. Or everything, but that's why I wanted to know what's the easiest way to have the data. Yeah, supposedly all calculated, yeah. And all stored as objects, so dot object files. Which means that you just call the constructor and call the load thing, and you can create a list of them or a vector of them or whatever you wanna do, yeah. Or just call them one at a time to populate the window. But that's what I wanted to know: what format do you want the data to come back in? 'Cause it can come back as almost anything. What's easiest to display on the screen? Yeah. Yeah, you could. Yeah.
So I can have it so that it returns you a... 'cause at the moment the main data structures are hash tables for the meetings. It's got one that's called percent talk, one that's percent noise and one that's percent participation. Do you see what I mean? And inside, it's got the speaker's name, and then it's got their percentage for that thing. So you can either have the hash table or you can have it returned as a vector, and it will say noise. Just a string: noise, X_ percent. I don't know, whatever. You could either have it like an embedded vector or an array of strings where each one represents one person, or whatever. Whatever's easiest, depending on how you wanna display it. Okay. Yeah, exactly. Exactly. Yeah. See, that's the... How did we... Can I have a look at that again? Okay. If it's in that format, it's speaker. Speaker is the controlling thing, not... yeah, okay. Yeah. So all that's calculated as well, stored as speaker objects. Yeah, that's easy. Some are quite amusing, actually. The influences of... I lived in Germany for six months, don't know if that had any effect. Just spend too much time talking to Brits. That was a bizarre thing. Where did these people come from? So then all that's calculated as well. All I have to do is get the dialogue acts. I don't think that'll be difficult. So then how is this box populated? Is it populated by the one present or the one highlighted? The one present, okay. 'Cause at the moment there are methods that say, um, for the... for other talks.
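The return formats discussed here (the hash table itself, one string per speaker, or a single number) can be sketched roughly as follows. This is only an illustration: the class name `MeetingStats`, its method names and the exact string layout are assumptions, and `Map` and `List` stand in for the hash table and vector mentioned in the discussion.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: names and string layout are illustrative, not the real classes.
class MeetingStats {
    // Speaker name -> percentage of the meeting that speaker spent talking.
    private final Map<String, Double> percentTalk = new LinkedHashMap<>();

    void setPercentTalk(String speaker, double percent) {
        percentTalk.put(speaker, percent);
    }

    // Option 1: hand back a copy of the whole table.
    Map<String, Double> getPercentTalkTable() {
        return new LinkedHashMap<>(percentTalk);
    }

    // Option 2: one ready-to-display string per speaker.
    List<String> getPercentTalkStrings() {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Double> e : percentTalk.entrySet()) {
            rows.add(String.format(Locale.ROOT, "%s: talk %.1f percent",
                    e.getKey(), e.getValue()));
        }
        return rows;
    }

    // Option 3: a single number for one speaker (0.0 if the speaker is unknown).
    double getPercentTalk(String speaker) {
        return percentTalk.getOrDefault(speaker, 0.0);
    }
}
```

Which option suits best depends on the display: a table widget can take the strings directly, while a text box that shows one speaker at a time only needs the single-number accessor.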
So it says like get talk time, and it takes a name, so that would call the meeting's method that said return that, and that would populate that, which should be an easy thing to do. And the same for that, comes out there. Meetings. Yeah, yeah. Yep. Yeah, that would be easy as well. Yeah, okay. So I'll just leave in lots of methods that'll just return one number at a time. That'll be the easiest way to do it, yeah. Okay, yeah, that's fine. Get talk time I think it's called at the moment, something like that, yeah. No, it's stored as an object file. It processes the whole lot off-line and stores it as an object, and then they're much, much smaller. They're only like one K_. Yeah, it's all pre-processed. And then each object's got a bunch of return methods. You have to re-create the object. It's got a load method. So what you do is you call a null constructor, 'cause if you call the proper constructor for each meeting, it goes off and does all the processing and stores its own object. If you call a null constructor, then you call load, and you can load whatever: one or all of them or anything like that. Go through a list. Yeah. One object for each meeting. Yeah. Although... yeah. At the moment it tells you who participated and the amount they participated, in percentages and in time as well. Yeah, that'd be easy. Yeah, that won't be too difficult. But that would cause a problem with anything that wasn't annotated for topics. Oh yeah. No topic specified, yeah, have a default. There are some defaults, actually. A lot of people don't get their own ch... and other stuff. But yeah. Okay. Right. Oh yeah, that wouldn't be a problem.
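The null-constructor-then-load pattern described here might look something like the sketch below, assuming standard Java serialization is what produces the object files. The class name `Meeting`, the field names, and the constructor arguments are hypothetical; the real processing step is replaced by simply copying a map.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a pre-processed meeting stored as a small object file.
class Meeting implements Serializable {
    private static final long serialVersionUID = 1L;

    private String name;
    private Map<String, Double> talkTimeSeconds = new HashMap<>();

    // Null constructor: does no processing. Call load() afterwards.
    Meeting() { }

    // "Proper" constructor: stands in for the expensive off-line processing,
    // then stores the finished object at the given path.
    Meeting(String name, Map<String, Double> talkTimeSeconds, String path) throws IOException {
        this.name = name;
        this.talkTimeSeconds = new HashMap<>(talkTimeSeconds);
        store(path);
    }

    private void store(String path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path))) {
            out.writeObject(this);
        }
    }

    // Fill this (empty) object from a previously stored object file.
    void load(String path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(path))) {
            Meeting stored = (Meeting) in.readObject();
            this.name = stored.name;
            this.talkTimeSeconds = stored.talkTimeSeconds;
        }
    }

    String getName() {
        return name;
    }

    // Returns one number at a time, as suggested for populating the display.
    double getTalkTime(String speaker) {
        return talkTimeSeconds.getOrDefault(speaker, 0.0);
    }
}
```

A browser would then create `new Meeting()`, call `load(path)` for each stored file it needs, and query accessors such as `getTalkTime` without repeating any processing.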
And then you could do a search over the meetings, over the objects. 'Cause that's the thing: these are so small, you can load the whole lot up and do a search of the whole lot to find who, by who, and what problem or what topics were in what. So it doesn't crash the thing. The global statistics come straight off of that, don't they. 'Cause they're just for the... The speaker class knows about all of that stuff, and the meeting class knows about that stuff. No, no, but I don't think that'll be too difficult. What we want, yeah. I thought the other stuff was more important anyway, so I did that first. No, design the GUI first, 'cause the problem is if you change the classes, the objects' serial numbers change and you can't re-load the objects, so all the processing has to be done over again. And I haven't quite finished it. So it would become out of synch and get a bit funny. If I gave you one and you worked on it, and then I changed it and ran the thing again, you wouldn't ever be able to load the objects back up. And then you'd have multiple copies of objects all over the place and it'd get silly, I think. But if you wanna make the picture, you can do that without anything, I'm presuming. Just the... that, yeah. The text box. 'Cause that doesn't... Okay. Alright. Well, it won't take very long to get it all finished. But I think I'll need to have done all of this stuff first, 'cause otherwise the objects won't be the same. No, it's in my home directory. Yeah. No, I haven't made it global yet. When it's finished it'll go up global, otherwise it would get confusing. That way it doesn't crash. If you try and load all ten in one it crashes, doesn't it. It's a bit dumb. You can fool it: if you load up ten different engines simultaneously, it can do that fine.
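The "serial numbers" problem sounds like Java's `serialVersionUID`, although that is an assumption. When a `Serializable` class does not declare one, Java derives it from the class's structure, so editing the class can make previously stored objects unloadable. A minimal sketch of one common mitigation, declaring the value explicitly, is shown below; the `Speaker` fields and the `SerialIdCheck` helper are hypothetical.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical speaker class with an explicitly declared serial version.
class Speaker implements Serializable {
    // Declared by hand, so compatible edits (such as adding a field)
    // no longer change the value that stored objects are checked against.
    private static final long serialVersionUID = 1L;

    String name;
    double averageTalkPercent;
}

class SerialIdCheck {
    // Reports the serial version that Java will compare when loading objects.
    static long idOf(Class<?> serializableClass) {
        return ObjectStreamClass.lookup(serializableClass).getSerialVersionUID();
    }
}
```

This does not help with incompatible changes such as changing a field's type, so finishing the classes before sharing stored objects is still the safer plan.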
But it can't do them if it thinks it's one, 'cause then it's about the amount that each... you have to kinda call a new class and then it will do it fine, but if you don't then it won't. Yeah, it says okay, oh yeah, I've got all this space, you can use some. Otherwise it goes oh no. It talks to me, yeah. Say nice machine, it goes... Oh, they're done, are they? Okay, cool. They pop up. Uh, yeah. Welcome to the N_L_S_D_ browser. Some speech and some music, some drum rolls. Yeah. Or a switch-board that comes up that's just a blank form like that with some buttons on. Load me a meeting, load me a search, load me something else. Whistle a tune. Yeah. Hmm. That's cheaper than N_X_T_ search. It would need you to have a meeting loaded before it will start doing any searching at all, wouldn't it? The only thing you can search is a NITE object model, and the only time you get one of those is if you've loaded an observation. Yeah, but that way you could use the inverted file search to return a list of meetings and then use one of those to load a search. It has to be an observation, and you can go and search the whole corpus from that, but it has to start with something for some bizarre reason. The engine class only has one search method. Yeah, yeah, globally, yeah. Yep. It doesn't take long to load up anyway. You can load a dumb one up that doesn't have any... Exactly, yeah. But it only gives one at a time anyway, doesn't it? 'Cause otherwise you'll crash the thing. Yeah. It's not too slow though, that thing. It's not too bad on that. I don't think that will be much of a problem. Yeah, yeah.
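The idea of using an inverted file to get a list of candidate meetings before loading any observation could be sketched as below. This is a generic inverted index, not the group's actual implementation; the class name `InvertedFile`, the tokenisation and the meeting identifiers are all assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: word -> set of meetings whose transcript contains it.
class InvertedFile {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Index the words of one meeting's transcript text.
    void add(String meetingId, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(meetingId);
            }
        }
    }

    // Meetings that mention the word, without loading any of them.
    Set<String> meetingsContaining(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySet());
    }
}
```

Only the meetings returned by `meetingsContaining` would then need to be loaded, one at a time, for the more detailed search.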
Make it strings for as long as possible, and then only return the things when it actually has to search. When it needs to be loaded. Yeah. Did we think about better names for the meetings? Oh, do they re-translate them? Do they? Okay, well that's alright then. We'll just use that then. Well, that's the working group, is it? Okay. I wanna see the meetings about even better understanding. Okay, that's cool, that's good. So who would... the, um, I_D_F_s? I mean the D_F_s. The document frequencies for each word in the corpus. Yeah, to do what Steve's talking about. To do the topic labelling. If somebody's done the keywords or the I_D_F_s or the D_F_s already, that would... Can't you do any better for our search without the T_F_I_D_F_? I think you need to. It's the amount that they occur over documents. Basically, D_F_, the document frequency, is the amount that each word occurs... um, no, what is it? One of them is the amount it occurs in each document and the other one is the amount it occurs generally. So the more it occurs in specific documents compared to its general score, the more informative it is about a certain... The corpus, the corpus, still data. Yeah. No. Yeah. But plus a stop list, so you remove stuff that doesn't... like, yeah, and then "the", which is gonna have basically an equal score. Or a massive... Bunch of key-words. That'd probably be the easiest thing. Key-words. Yeah, key-words, three, five words. In both documents. Yeah, term frequency inverse document frequency. I did do it once. I do have a Java class that does it for something, I don't know whether it'll work with this. But yeah. Also, key-words give you a whole new type of search. You do key-word search. But key-word search could be topic search; they can be the same thing.
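For reference, the two quantities being recalled here are usually defined as follows: term frequency counts how often a word occurs in one document, and document frequency counts how many documents contain the word at all. A minimal sketch using the common weighting tf × log(N / df), with a stop list applied first, is given below. The class name `TfIdf` and its method signatures are assumptions, and other weighting variants exist.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of TF-IDF over tokenised documents (e.g. meetings).
class TfIdf {
    // Document frequency: in how many documents does each non-stop word occur?
    static Map<String, Integer> documentFrequencies(List<List<String>> docs, Set<String> stopList) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            Set<String> distinct = new HashSet<>(doc);
            distinct.removeAll(stopList);
            for (String word : distinct) {
                df.merge(word, 1, Integer::sum);
            }
        }
        return df;
    }

    // tf * log(N / df): high when a word is frequent in this document
    // but occurs in few documents overall. Stop words score 0.
    static double tfIdf(String word, List<String> doc, Map<String, Integer> df, int numDocs) {
        int tf = Collections.frequency(doc, word);
        Integer docFreq = df.get(word);
        if (tf == 0 || docFreq == null) {
            return 0.0;
        }
        return tf * Math.log((double) numDocs / docFreq);
    }
}
```

The three to five highest-scoring words per document would then serve as its key-words.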
Instead it would just search for key-words: you tell them it's topics, but it's actually searching with key-words. Do you see what I mean? But I suppose even calculating the what's-its-faces themselves would take too long. It's probably easier to calculate them based on their occurrences in the whole corpus than it is to calculate them per topic, 'cause you don't have to integrate as much information. No, you can do search without T_F_I_D_F_, you just can't rank the search. Yeah. No, no, but isn't that what the idea was in the first place, to rank the results so that... Yeah, but that won't slow it down. Ranking it won't slow it down. Yeah. It still uses an inverted file, but it ranks the results by the higher... yeah. I thought that was part of it, but yeah, okay, it doesn't matter if it's not. No, we did. Yeah. But I guess if that's not part of it, don't worry about it. 'Cause I'm only gonna do this if I've got time anyway. So yeah. Well, it'd just give you a rank. That was the whole point: you say, this is your top one, this is your bottom. Yeah. But how informative? T_F_I_D_F_ is an informative score, isn't it. Depends how you treat your compound nouns. Like what? As a compound noun. Uh, sunny day, yeah. And like an adjective, yeah. In its most simple form it would do a separate rank for each term. You could make it more complicated and make it do it for the... yeah. You can just add 'em up, or you can... Yeah, yeah. Yeah, you don't wanna start looking for... Yeah. I guess you just do a sum of the individual T_F_I_D_F_ for each term returned, and that will be a bit crude, but it will give you a score, and the more informative each term is for each thing, the higher the score it would give you.
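The crude ranking suggested here, summing the individual T_F_I_D_F_ scores of the query terms for each meeting, could be sketched as follows. It assumes the per-meeting scores have already been computed and stored in a map; `SumRanker` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rank meetings by the sum of per-term TF-IDF scores.
class SumRanker {
    // Sum the scores of the query terms for one meeting; missing terms count as 0.
    static double score(List<String> queryTerms, Map<String, Double> tfIdfForMeeting) {
        double sum = 0.0;
        for (String term : queryTerms) {
            sum += tfIdfForMeeting.getOrDefault(term, 0.0);
        }
        return sum;
    }

    // Meeting ids ordered from highest to lowest summed score.
    static List<String> rank(List<String> queryTerms,
                             Map<String, Map<String, Double>> tfIdfByMeeting) {
        List<String> ids = new ArrayList<>(tfIdfByMeeting.keySet());
        ids.sort((a, b) -> Double.compare(
                score(queryTerms, tfIdfByMeeting.get(b)),
                score(queryTerms, tfIdfByMeeting.get(a))));
        return ids;
    }
}
```

Because the scores are looked up rather than recomputed at query time, the ranking step adds only a sort over the meetings that the search returned.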
It's pretty crude anyway, but all it's gonna do is look for six separate... oh, 'cause then it's gonna go into the N_X_T_ search and return that, isn't it. So, mm. Yeah. Yeah, that's true. So yeah, that's less crude, isn't it. But, um. Groups of terms. Yeah, without doing any, like, word pairs, which is just an omission. Yeah, I don't know how that works. That's how I remem... Yeah. But then the idea is, that gives you an informative score. How you combine that is up to you. I guess there's lots in the literature. There's a whole load about it in Manning and Schütze. They've got a whole chunk about it. It's just I_R_, isn't it, this. Basic information retrieval. They've got a good chapter on it. If you haven't got it, it's on Cognate. Yeah. Me too. Hundreds of P_D_F_s. Yeah. Okay. Do you want that on the start-up screen, yeah? Yeah, I guess so. Yeah, I guess that's useful. You could have a save preferences, I guess. Well alright, call it favourites then. You can have a favourites. Yeah, it's not enough information, is it. Yeah. Yeah, it would be quite good if it has... yeah. Just search buttons, so just... Mm. Doofus mode. Or search, yeah. So even if it just had two things: one said take me straight to this meeting, and had a text-box you can enter it in, or a drop-down menu. And then another that said search, that loaded instantly. It loaded up the search screen. Yeah. Or the other thing to do is just have search as the default. It opens and the search window opens. And that's the interface, and you just go from there. And then that brings up the browser after you've searched for something.
Or the search window could have on it a drop-down menu and a go button that said take me to this. So the only thing that comes up when you start is just one window like that, and it's got all the search stuff, like, down there. And so this is your search. It's all here, and here is just go to wherever and a go button. And then from there it takes you to wherever else you wanna go. Yeah. And the other topic says welcome to our browser. A drop-down menu. Oh, you wanna have... oh yeah, the users. They're always nonsense, yeah. Yeah, that's true. Yeah, it does. This is good, but if you wanna look for one meeting and just look at it, then that's fine. That's true. Or unless you have two, one for meeting, one for speaker. And you can choose: you can go for one or go for the other. Or go for both. If you go for both, you're searching. Yeah. We could do it Microsoft stylie, and hold it over and it pops up a thing. Is that complicated? Mouse over, isn't it, or something. Oh, let's do that then. Get out of the bloody way, I'm trying to search the damn thing. Yeah, that's true, that's annoying. But that's why you could just have a list of your users then. You just say I wanna look for this user. Go. Find me. Then it pulls up a list of all the ones that've got that user in it. And then you search... but then you didn't search. Maybe just leave it, just have them there, and don't worry about the speakers. If they're doing speakers, they're doing search. It's not the same as doing a quick access. Yeah. Or you can have a text box there that's got... yeah, as you go over them. Then that doesn't get in the way. Yeah. The full name and then speakers. Yeah.
We can do all of that without ever going anywhere near loading up a... I think. Oh, cool. And, uh, a thing. A meeting, a NOM. Yeah. I think it's in the right object model, but I'm not sure. No, uh, yeah, you're right, it's the right corpus, yeah. It's the NOM write corpus and then you've got NOM write elements in it. Got NOM write attributes in... Yeah. I think the N_ ones are interfaces and the NITE one... one of the ways round, one's an interface and one's actually an implemented class. 'Cause you go back enough and the, um, what's-its-name is not very good. The A_P_I_ is alright, but there's not a lot of description in it. It's very crude. It tells you what... Yeah. So you end up with it just saying returns an N_ text box. Okay, what's that? Doh. It's an implementation of an N_ text interface. What's that? Oh, it's an extended version of a... Stop it. Might be useful, mightn't it? Yeah, I should have more to talk about. Oh, for next week. No, just after S_P_ two. Yeah, yeah. I don't mean straight after. Yeah, for that. Yeah, that could be quite good. Right. Um, Friday morning? Three o'clock. I don't know. Maybe I do, I'm not sure what I'm doing this weekend. Um, two. Should we say three o'clock, and then if there's a serious problem, I'll tell you. Three's better than five or six. So say three o'clock, and if three o'clock's a problem, five or six will be a problem, 'cause I won't be here. But I don't think that will be... I'm not sure. Is she collecting them? Oh, you're just sending it to... No. I mean, is she collecting... oh right, oh yeah, sorry. Yeah. School for the gifted. Yeah. Well, I thought somebody was collecting them. But right, well, that's what I thought when you said that. Then we'd know who's missed them or if we've done any. Yeah. No. Yeah, that would help a lot. For single terms it would be very useful.
For multiple terms, unless you wanna do something... there will be a way of doing it for multiple terms. Yeah. Or you could do it and override it: you could just ignore the ranking if they don't show up together. Or you could penalize it, you could just put a weight against it. Yeah, you do the N_X_T_, so it doesn't show up together. Either disregard it or put a weighting against it. So how many pairs you get, you can... Yeah, so then just next. Yeah. Yeah, might have to talk to Pernilla about that. 'Cause some things you're gonna get a lot of results for. And if the one that you got just happened to be at the bottom when it was actually the most relevant one, something like that would just push it up. Yeah. Which is the highest, exactly. And also with something like sunny day, if sunny and day aren't mentioned together a lot but sunny just happens to be mentioned once, then its term will be low and it will push down the other one if you combine them. If you just add them together. Yeah, yeah. But you have no ranking system at the moment, so if something's a highly ranked thing from T_F_I_D_F_, it could just be ignored because it falls off the bottom of the... I'm assuming you return all results, so you type language and it returns seventy five meetings. Yeah, but there are seventy five meetings. Yeah, yeah. So if you return seventy five, where do you stop? How do you rank them or something. Even if it returns twenty, do you cut off at ten, do you rank them, what's the threshold? Something like that would be... Look through them all, yeah. One at a time and... Mm-hmm. Mm-hmm. Yeah. So is that it? We're done. Tick. I've signed off already.
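The two options raised for multi-term queries (disregard the combined score when the terms do not occur together, or put a weighting against it) and the question of where to cut off the result list could be sketched as below. Everything here is illustrative: the class `PenalisedRanker`, the `penalty` parameter and the cut-off `k` are assumptions, and how co-occurrence is detected is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: combine term scores with a co-occurrence penalty,
// then keep only the top k meetings.
class PenalisedRanker {
    // penalty = 0.0 disregards results whose terms never occur together;
    // a value such as 0.5 merely puts a weighting against them.
    static double combine(double[] termScores, boolean termsCoOccur, double penalty) {
        double sum = 0.0;
        for (double s : termScores) {
            sum += s;
        }
        return termsCoOccur ? sum : sum * penalty;
    }

    // Highest-scoring meeting ids first, cut off after k results.
    static List<String> topK(Map<String, Double> scoreByMeeting, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scoreByMeeting.entrySet());
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

With a ranking in place, a query that matches seventy five meetings can still show the most relevant ones first, and the cut-off becomes a display choice rather than a loss of good results.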